Discourse Constraints for Document Compression
Abstract
Sentence compression holds promise for many applications ranging from summarization to subtitle generation. The task is typically performed on isolated sentences without taking the surrounding context into account, even though most applications would operate over entire documents. In this article we present a discourse-informed model which is capable of producing document compressions that are coherent and informative. Our model is inspired by theories of local coherence and formulated within the framework of integer linear programming. Experimental results show significant improvements over a state-of-the-art discourse-agnostic approach.
Similar resources
Global inference for sentence compression : an integer linear programming approach
In this thesis we develop models for sentence compression. This text rewriting task has recently attracted a lot of attention due to its relevance for applications (e.g., summarisation) and simple formulation by means of word deletion. Previous models for sentence compression have been inherently local and thus fail to capture the long range dependencies and complex interactions involved in tex...
Modelling Compression with Discourse Constraints
Sentence compression holds promise for many applications ranging from summarisation to subtitle generation. The task is typically performed on isolated sentences without taking the surrounding context into account, even though most applications would operate over entire documents. In this paper we present a discourse informed model which is capable of producing document compressions that are co...
A Noisy-Channel Model for Document Compression
We present a document compression system that uses a hierarchical noisy-channel model of text production. Our compression system first automatically derives the syntactic structure of each sentence and the overall discourse structure of the text given as input. The system then uses a statistical hierarchical model of text production in order to drop non-important syntactic and discourse constit...
Methodology for Validation of Issuance of Mystical and Ethical Narrations (A Case Study and Discourse Analysis on the Methodology of the Book Sirr ul-asra’)
The Book “the Secret of Prophet Mohammad’s Midnight Journey to the Seven Heavens in Explanation of Al-Mi’raj Hadith” is written by Ayatollah Sa’adatparvar. Analyzing the discourse of a part of its introduction, his recognition method about this hadith has been investigated in this paper. The paper aims at investigating the particular discourse pattern of the author in analyzing the document of ...
Testing Structural Properties in Textual Data: Beyond Document Grammars
This article describes research carried out in the project "Secondary information structuring and comparative discourse analysis" (SEKIMO), which is part of the research group "Texttechnological modeling of information" and is funded by the German Research Council (DFG). In our project, we use XML document grammars, i.e. DTDs (Bray et al., 2000), XML Schema (Thompson et al., 2001) and Relax NG ...
Journal:
- Computational Linguistics
Volume 36
Publication date: 2010